THUIR at NTCIR-9 INTENT Task
Abstract
This is the first year that the IR group of Tsinghua University (THUIR) participates in NTCIR. We register for the INTENT task and focus on the Chinese topics of the Subtopic Mining and Document Ranking subtasks. In our experiments, we mine subtopics from three resources: query recommendations, Wikipedia, and a query-URL bipartite graph constructed from clickthrough data. We also develop methods that re-rank the subtopics and remove duplicates using query logs and search engine result snippets. In the Document Ranking subtask, we test whether methods used to diversify English documents, such as HITS, Novelty-Result Selection and Documents Duplication Elimination, are also effective on Chinese pages. Based on the new metric D#-nDCG, we propose a DocumentDiversification algorithm that selects from the documents retrieved for the subtopics mined in the Subtopic Mining subtask, and we use user browse logs to re-rank the selected results.
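The abstract names D#-nDCG but does not define it. The following is a minimal sketch, assuming the commonly cited formulation in which D#-nDCG is a weighted sum of intent recall (I-rec) and D-nDCG, with each document's global gain being its per-intent gain weighted by the intent probability. The function name `d_sharp_ndcg`, the argument layout, and the default `gamma=0.5` are illustrative assumptions, not details taken from the paper.

```python
import math


def d_sharp_ndcg(ranking, intent_probs, gains, cutoff=10, gamma=0.5):
    """Sketch of D#-nDCG at a cutoff (assumed formulation, not the paper's code).

    ranking:      list of document ids, best first
    intent_probs: {intent: P(intent | query)}
    gains:        {intent: {doc_id: gain value}}
    """
    def global_gain(doc):
        # Per-intent gains weighted by the probability of each intent.
        return sum(p * gains.get(i, {}).get(doc, 0.0)
                   for i, p in intent_probs.items())

    def dcg(values):
        # Log-discounted cumulative gain over ranks 1..len(values).
        return sum(g / math.log2(r + 1) for r, g in enumerate(values, start=1))

    top = ranking[:cutoff]

    # Ideal list: all judged documents sorted by global gain.
    judged = {d for per_intent in gains.values() for d in per_intent}
    ideal = sorted((global_gain(d) for d in judged), reverse=True)[:cutoff]
    ideal_dcg = dcg(ideal)
    d_ndcg = dcg([global_gain(d) for d in top]) / ideal_dcg if ideal_dcg > 0 else 0.0

    # Intent recall: fraction of intents covered by at least one relevant document.
    covered = {i for i in intent_probs
               if any(gains.get(i, {}).get(d, 0.0) > 0 for d in top)}
    i_rec = len(covered) / len(intent_probs) if intent_probs else 0.0

    return gamma * i_rec + (1 - gamma) * d_ndcg
```

Under these assumptions, a ranking that covers every intent with the highest-gain documents scores 1.0, while a ranking that serves only one intent is penalised by both terms.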
Similar resources
THUIR at NTCIR-10 INTENT-2 Task
This paper describes our approaches and results in the NTCIR-10 INTENT-2 task. This year, we participate in subtasks for both the Chinese and English topics. We extract subtopics from multiple resources for these topics, and propose several subtopic clustering and re-ranking methods. In the Document Ranking subtask, we redefine the novelty of a document and use the new definition to...
THUIR at NTCIR-12 IMine Task
In this paper, we describe our approaches in the NTCIR-12 IMine task, including Chinese Query Understanding and Chinese Vertical Incorporating. In the Query Understanding subtask, we propose different strategies to mine subtopic candidates from a wide range of resources and present a two-step method to predict the vertical intent for each subtopic. In the Vertical Incorporating subtask, we adopt a probab...
RMIT and Gunma University at NTCIR-9 Intent Task
In this report, we describe our experimental results for the NTCIR-9 intent task. For our experiments, we use our experimental search engine, Newt. Newt is a ranked self-index capable of supporting multiple languages by deferring linguistic decisions until query time. To our knowledge, this is the first Information Retrieval task on the ClueWeb09-JA collection performed entirely with ranked self...
Overview of the NTCIR-9 INTENT Task
This is an overview of the NTCIR-9 INTENT task, which comprises the Subtopic Mining and the Document Ranking subtasks. The INTENT task attracted participating teams from seven different countries/regions – 16 teams for Subtopic Mining and 8 teams for Document Ranking. The Subtopic Mining subtask received 42 Chinese runs and 14 Japanese runs; the Document Ranking subtask received 24 Chinese runs...
University of Glasgow at the NTCIR-9 Intent task: Experiments with Terrier on Subtopic Mining and Document Ranking
We describe our participation in the subtopic mining and document ranking subtasks of the NTCIR-9 Intent task, for both Chinese and Japanese languages. In the subtopic mining subtask, we experiment with a novel data-driven approach for ranking reformulations of an ambiguous query. In the document ranking subtask, we deploy our state-of-the-art xQuAD framework for search result diversification.